 model complexity


Simple and Efficient Heterogeneous Temporal Graph Neural Network

Neural Information Processing Systems

Heterogeneous temporal graphs (HTGs) are ubiquitous data structures in the real world. Recently, to enhance representation learning on HTGs, numerous attention-based neural networks have been proposed. Despite these successes, existing methods rely on a decoupled temporal and spatial learning paradigm, which weakens interactions of spatio-temporal information and leads to a high model complexity. To bridge this gap, we propose a novel learning paradigm for HTGs called Simple and Efficient Heterogeneous Temporal Graph Neural Network (SE-HTGNN). Specifically, we innovatively integrate temporal modeling into spatial learning via a novel dynamic attention mechanism, which substantially reduces model complexity while enhancing discriminative representation learning on HTGs. Additionally, to comprehensively and adaptively understand HTGs, we leverage large language models to prompt SE-HTGNN, enabling the model to capture the implicit properties of node types as prior knowledge. Extensive experiments demonstrate that SE-HTGNN achieves up to 10× speed-up over the state-of-the-art and latest baseline while maintaining the best forecasting accuracy.


A Rigorous, Tractable Measure of Model Complexity

arXiv.org Machine Learning

One of the most fundamental properties of a machine learning model is its complexity, with applications across topics such as interpretation, generalization, and model selection. Despite its importance, there is no canonical, model-agnostic way to assess a model's complexity. While simple heuristics, such as the number or magnitude of parameters, yield very crude estimates, hyperparameter-based approaches, such as polynomial degree or kernel length scale, do not generalize across model classes. More rigorous methods, including the Vapnik-Chervonenkis dimension (VCD) (Vapnik, 2013), Rademacher complexity (RMC) (Bartlett and Mendelson, 2002), and effective number of parameters (or effective degrees of freedom, ENP) (Efron, 1986), are difficult, or even impossible, to compute in practice, leaving the user to resort to crude bounds and/or approximations. The topic is further complicated by the often overlooked distinction between model and function complexity, where the former sets a ceiling on the latter.


Neural approximation of Wasserstein distance via a universal architecture for symmetric and factorwise group invariant functions

Neural Information Processing Systems

Learning distance functions between complex objects, such as the Wasserstein distance to compare point sets, is a common goal in machine learning applications. However, functions on such complex objects (e.g., point sets and graphs) are often required to be invariant to a wide variety of group actions e.g.





Transfer Learning via Minimizing the Performance Gap Between Domains

Neural Information Processing Systems

To address this issue, we present the first analysis for instance weighting transfer learning that considers the presence of labeled target examples. The contribution of our work is two-fold. 1. We address the question of how to measure the divergence between two domains given label information for the target domain.




A Mathematical Framework for Quantifying Transferability in Multi-source Transfer Learning

Neural Information Processing Systems

Therefore, for a source task with a complex model or few training samples, even though it is similar to the target task, the knowledge transferable from this source task can still be very limited.